replicate() for Simulation
One of the simplest and most common examples of a random phenomenon is a coin flip: an event that is either “yes” or “no” with some probability. Here you’ll learn about the binomial distribution, which describes the behavior of a combination of yes/no trials, and how to predict and simulate its behavior.
In these exercises, you’ll practice using the rbinom() function, which generates random “flips” that are either 1 (“heads”) or 0 (“tails”).
With one line of code, simulate 10 coin flips, each with a 30% chance of coming up 1 (“heads”).
# Generate 10 separate random flips with probability .3
rbinom(10,1,.3)
FALSE [1] 1 0 0 0 1 1 0 0 0 0
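The same call can be written with named arguments, which makes the role of each value explicit. This is a minimal sketch that assumes the standard rbinom() argument names (n, size, prob); the seed value and the variable name flips are arbitrary choices made only so the draws can be repeated.

```r
# Arbitrary seed so the random draws are reproducible
set.seed(42)

# n = number of draws, size = coins flipped per draw, prob = chance of heads
flips <- rbinom(n = 10, size = 1, prob = 0.3)
flips

# With size = 1, every entry is a single flip: 0 (tails) or 1 (heads)
all(flips %in% c(0, 1))
```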
In the last exercise, you simulated 10 separate coin flips, each with a 30% chance of heads. Thus, with rbinom(10, 1, .3) you ended up with 10 outcomes that were either 0 (“tails”) or 1 (“heads”).
But by changing the second argument of rbinom() (currently 1), you can flip multiple coins within each draw. With a second argument of 10, each outcome will be a number between 0 and 10, showing the number of flips that were heads in that trial.
Use the rbinom() function to simulate 100 separate occurrences of flipping 10 coins, where each coin has a 30% chance of coming up heads.
# Generate 100 occurrences of flipping 10 coins, each with 30% probability
rbinom(100,10,.3)
FALSE [1] 2 2 0 4 4 5 3 2 3 2 2 3 4 4 3 4 6 1 2 1 1 3 3 2 4 3 2 3 3 2 3 4 2 2 4
FALSE [36] 0 2 5 2 2 6 3 1 3 1 2 1 2 3 2 3 2 1 2 5 3 4 3 2 3 3 6 3 2 1 3 1 5 2 2
FALSE [71] 4 5 0 3 1 3 3 4 0 3 3 4 3 4 5 5 4 1 1 4 1 4 2 4 3 4 3 2 1 2
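A quick way to read a long vector of draws like this is to tabulate it. The sketch below is illustrative only: the seed and the name heads are assumptions, and the counts will differ from the output above because the draws are random.

```r
# Arbitrary seed so the random draws are reproducible
set.seed(42)

# 100 draws, each counting the heads among 10 coins with a 30% chance of heads
heads <- rbinom(n = 100, size = 10, prob = 0.3)

# Each draw is a count of heads, so it must lie between 0 and 10
range(heads)

# How often each count of heads occurred across the 100 draws
table(heads)
```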
If you flip 10 coins each with a 30% probability of coming up heads, what is the probability exactly 2 of them are heads?
Answer the above question using the dbinom() function. This function takes almost the same arguments as rbinom(). The second and third arguments are size and prob, but now the first argument is x instead of n. Use x to specify where you want to evaluate the binomial density.
Confirm your answer using the rbinom() function by creating a simulation of 10,000 trials. Put this all on one line by wrapping the mean() function around the rbinom() function.
# Calculate the probability that 2 are heads using dbinom
dbinom(2,10,.3)
FALSE [1] 0.2334744
# Confirm your answer with a simulation using rbinom
mean(rbinom(10000,10,.3)==2)
FALSE [1] 0.2391
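The exact value from dbinom() can also be reproduced by hand from the binomial probability formula, choose(size, x) * p^x * (1 - p)^(size - x). The variable names below are illustrative and not part of the exercise.

```r
size <- 10   # coins per trial
x <- 2       # number of heads of interest
p <- 0.3     # chance of heads for each coin

# Number of ways to pick which coins are heads, times the chance of each way
by_formula <- choose(size, x) * p^x * (1 - p)^(size - x)
by_dbinom <- dbinom(x, size, p)

# The two calculations agree
all.equal(by_formula, by_dbinom)
```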
If you flip ten coins that each have a 30% probability of heads, what is the probability at least five are heads?
Answer the above question using the pbinom() function. (Note that you can compute the probability that the number of heads is less than or equal to 4, then subtract that probability from 1.)
Confirm your answer with a simulation of 10,000 trials by finding the fraction of trials that result in 5 or more heads.
# Calculate the probability that at least five coins are heads
1-pbinom(4,10,.3)
FALSE [1] 0.1502683
# Confirm your answer with a simulation of 10,000 trials
mean(rbinom(10000,10,.3)>=5)
FALSE [1] 0.1497
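There is more than one way to get this upper-tail probability in base R. The sketch below compares three of them: the complement used above, the lower.tail = FALSE argument of pbinom(), and a direct sum of dbinom() over 5 through 10 heads. The variable names are assumptions made for illustration.

```r
# Complement of P(X <= 4), as in the exercise
p_complement <- 1 - pbinom(4, 10, 0.3)

# lower.tail = FALSE returns P(X > 4), which for whole counts is P(X >= 5)
p_upper <- pbinom(4, 10, 0.3, lower.tail = FALSE)

# Add up the exact probabilities of 5, 6, ..., 10 heads
p_summed <- sum(dbinom(5:10, 10, 0.3))

c(p_complement, p_upper, p_summed)
```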
In the last exercise you tried flipping ten coins with a 30% probability of heads to find the probability at least five are heads. You found that the exact answer was 1 - pbinom(4, 10, .3) = 0.1502683, then confirmed with 10,000 simulated trials.
Did you need all 10,000 trials to get an accurate answer? Would your answer have been more accurate with more trials?
Try answering this question with simulations of 100, 1,000, 10,000, and 100,000 trials, so you can see which is the closest to the exact answer.
# Here is how you computed the answer in the last problem
mean(rbinom(10000, 10, .3) >= 5)
FALSE [1] 0.1437
# Try now with 100, 1000, 10,000, and 100,000 trials
mean(rbinom(100, 10, .3) >= 5)
FALSE [1] 0.15
mean(rbinom(1000, 10, .3) >= 5)
FALSE [1] 0.156
mean(rbinom(10000, 10, .3) >= 5)
FALSE [1] 0.1501
mean(rbinom(100000, 10, .3) >= 5)
FALSE [1] 0.14969
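One way to reason about how close each simulation should be is the standard error of a simulated proportion, sqrt(p * (1 - p) / n). This formula is a standard result that the exercise itself does not state, so treat the sketch below as supplementary; the variable names are illustrative.

```r
# Exact probability of at least five heads
p_exact <- 1 - pbinom(4, 10, 0.3)

# Simulation sizes tried in the exercise
n_trials <- c(100, 1000, 10000, 100000)

# Typical size of the gap between a simulated fraction and the exact value
std_error <- sqrt(p_exact * (1 - p_exact) / n_trials)

data.frame(n_trials, std_error)
```

Each 100-fold increase in the number of trials shrinks the typical error about 10-fold, so a 100-trial run that lands very close to the exact answer got there largely by chance.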
What is the expected value of a binomial distribution where 25 coins are flipped, each having a 30% chance of heads?
Calculate this using the exact formula you learned in the lecture: the expected value of the binomial is size * p. Print this result to the screen.
Confirm with a simulation of 10,000 draws from the binomial.
# Calculate the expected value using the exact formula
25*.3
FALSE [1] 7.5
# Confirm with a simulation using rbinom
mean(rbinom(10000,25,.3))
FALSE [1] 7.5268
What is the variance of a binomial distribution where 25 coins are flipped, each having a 30% chance of heads?
Calculate this using the exact formula you learned in the lecture: the variance of the binomial is size * p * (1 - p). Print this result to the screen.
Confirm with a simulation of 10,000 trials.
# Calculate the variance using the exact formula
25*.3*(1-.3)
FALSE [1] 5.25
# Confirm with a simulation using rbinom
var(rbinom(10000,25,.3))
FALSE [1] 5.278581
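Both formulas can be checked without simulation by working directly from the definitions: the expected value is the probability-weighted average of the outcomes, and the variance is the probability-weighted average squared distance from that mean. A minimal sketch, with illustrative variable names:

```r
size <- 25
p <- 0.3

# All possible numbers of heads and their exact probabilities
x <- 0:size
probs <- dbinom(x, size, p)

# Definitions of expected value and variance for a discrete variable
expected <- sum(x * probs)
variance <- sum((x - expected)^2 * probs)

# Compare with size * p and size * p * (1 - p)
c(expected = expected, variance = variance)
```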
In this chapter you’ll learn to combine multiple probabilities, such as the probability two events both happen or that at least one happens, and confirm each with random simulations. You’ll also learn some of the properties of adding and multiplying random variables.
If events A and B are independent, and A has a 40% chance of happening, and event B has a 20% chance of happening, what is the probability they will both happen?
Hint: To find the probability independent events A and B both happen, multiply their probabilities.
You can also use simulation to estimate the probability of two events both happening.
Randomly simulate 100,000 flips of coin A, each of which has a 40% chance of being heads. Save this as a variable A.
Randomly simulate 100,000 flips of coin B, each of which has a 20% chance of being heads. Save this as a variable B.
Use the “and” operator (&) to combine the variables A and B to estimate the probability that both A and B are heads.
# Simulate 100,000 flips of a coin with a 40% chance of heads
A <- rbinom(100000, 1, .4)
# Simulate 100,000 flips of a coin with a 20% chance of heads
B <- rbinom(100000, 1, .2)
# Estimate the probability both A and B are heads
mean(A & B)
FALSE [1] 0.07967
Randomly simulate 100,000 flips of A (40% chance), B (20% chance), and C (70% chance). What fraction of the time do all three coins come up heads?
You’ve already simulated A and B. Now simulate 100,000 flips of coin C, where each has a 70% chance of coming up heads.
Use A, B, and C to estimate the probability that all three coins would come up heads.
# You've already simulated 100,000 flips of coins A and B
A <- rbinom(100000, 1, .4)
B <- rbinom(100000, 1, .2)
# Simulate 100,000 flips of coin C (70% chance of heads)
C <- rbinom(100000, 1, .7)
# Estimate the probability A, B, and C are all heads
mean(A & B & C)
FALSE [1] 0.05593
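The simulated fractions for two and three coins can be compared with the exact values from the multiplication rule for independent events. The variable names below are assumptions for illustration.

```r
p_a <- 0.4   # chance coin A is heads
p_b <- 0.2   # chance coin B is heads
p_c <- 0.7   # chance coin C is heads

# Independent events: multiply the individual probabilities
p_a_and_b <- p_a * p_b
p_all_three <- p_a * p_b * p_c

c(p_a_and_b, p_all_three)
```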
If coins A and B are independent, A has a 60% chance of coming up heads, and B has a 10% chance of coming up heads, what is the probability that either A or B will come up heads?
Hint: The probability of A or B happening (when A and B are independent, as they are here) is P(A) + P(B) - P(A) * P(B).
In the last exercise, you found that there was a 64% chance that either coin A (60% chance) or coin B (10% chance) would come up heads. Now you’ll confirm that answer using simulation.
Use rbinom() to simulate 100,000 flips of coin A, each having a 60% chance of being heads.
Use rbinom() to simulate 100,000 flips of coin B, each having a 10% chance of being heads.
Use these to estimate the probability that A or B is heads.
# Simulate 100,000 flips of a coin with a 60% chance of heads
A <- rbinom(100000,1,.6)
# Simulate 100,000 flips of a coin with a 10% chance of heads
B <- rbinom(100000,1,.1)
# Estimate the probability either A or B is heads
mean(A | B)
FALSE [1] 0.63986
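The exact value can be computed two equivalent ways: with the formula from the hint, or as the complement of "neither coin is heads". The second form is not part of the exercise and is shown only as a cross-check; the variable names are illustrative.

```r
p_a <- 0.6   # chance coin A is heads
p_b <- 0.1   # chance coin B is heads

# P(A) + P(B) - P(A) * P(B), as in the hint
p_either <- p_a + p_b - p_a * p_b

# Complement: 1 minus the chance that both coins are tails
p_either_complement <- 1 - (1 - p_a) * (1 - p_b)

c(p_either, p_either_complement)
```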
Suppose X is a random Binom(10, .6) variable (10 flips of a coin with 60% chance of heads) and Y is a random Binom(10, .7) variable (10 flips of a coin with a 70% chance of heads), and they are independent.
What is the probability that either of the variables is less than or equal to 4?
Simulate 100,000 draws from each of X (10 coins, 60% chance of heads) and Y (10 coins, 70% chance of heads) binomial variables, saving them as X and Y respectively.
Use these simulations to estimate the probability that either X or Y is less than or equal to 4.
Use the pbinom() function to calculate the exact probability that X is less than or equal to 4, then the probability that Y is less than or equal to 4.
Combine these two exact probabilities to calculate the exact probability that either variable is less than or equal to 4.
# Use rbinom to simulate 100,000 draws from each of X and Y
X <- rbinom(100000, 10, .6)
Y <- rbinom(100000, 10, .7)
# Estimate the probability either X or Y is <= 4
mean(X <= 4 | Y <= 4)
FALSE [1] 0.20478
# Use pbinom to calculate the probabilities separately
prob_X_less <- pbinom(4,10,.6)
prob_Y_less <- pbinom(4, 10, .7)
# Combine these to calculate the exact probability either <= 4
prob_X_less+prob_Y_less-(prob_Y_less*prob_X_less)
FALSE [1] 0.2057164
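As a cross-check on the combination rule, the same probability can be obtained by building the full joint distribution of X and Y and adding up every cell where at least one of them is 4 or less. This is a sketch that relies on independence (the joint probability of a pair is the product of the two marginals); the variable names are assumptions.

```r
x_vals <- 0:10
y_vals <- 0:10

# 11 x 11 table of joint probabilities; rows index X, columns index Y
joint <- outer(dbinom(x_vals, 10, 0.6), dbinom(y_vals, 10, 0.7))

# TRUE for every (x, y) pair where x <= 4 or y <= 4
either_low <- outer(x_vals <= 4, y_vals <= 4, FUN = "|")

# Sum the joint probabilities over those pairs
p_joint <- sum(joint[either_low])
p_joint
```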
If X is a binomial with size 50 and p = .4, what is the expected value of 3*X?
Hint: The expected value of a binomial is size * p, and the expected value of k * X is k * E[X].
In this exercise you’ll use simulation to confirm the rule you just learned about how multiplying a random variable by a constant affects its expected value.
Simulate 100,000 draws of X, a binomial random variable with size 20 and p = .1. Save this as X.
Use this simulation to estimate the expected value of X.
Use this simulation to estimate the expected value of 5*X, as well.
# Simulate 100,000 draws of a binomial with size 20 and p = .1
X <- rbinom(100000,20,.1)
# Estimate the expected value of X
mean(X)
FALSE [1] 2.00379
# Estimate the expected value of 5 * X
mean(5*X)
FALSE [1] 10.01895
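The simulated means can be set next to the exact values from the rule E[k * X] = k * E[X]. The sketch below uses an arbitrary seed and illustrative names for the exact values; it also shows that scaling a sample scales its mean by exactly the same factor.

```r
# Arbitrary seed so the random draws are reproducible
set.seed(42)
X <- rbinom(100000, 20, 0.1)

# Exact expected values from size * p and the scaling rule
exact_mean_X <- 20 * 0.1
exact_mean_5X <- 5 * exact_mean_X

# Simulated estimates next to the exact values
c(simulated = mean(X), exact = exact_mean_X)
c(simulated = mean(5 * X), exact = exact_mean_5X)

# On any sample, the mean of 5 * X equals 5 times the mean of X
all.equal(mean(5 * X), 5 * mean(X))
```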
In the last exercise you simulated X from a binomial with size 20 and p = .1. Now you’ll use this same simulation to explore the variance.
Use this simulation to estimate the variance of X.
Estimate the variance of 5 * X.
# X is simulated from 100,000 draws of a binomial with size 20 and p = .1
X <- rbinom(100000, 20, .1)
# Estimate the variance of X
var(X)
FALSE [1] 1.797931
# Estimate the variance of 5 * X
var(5*X)
FALSE [1] 44.94829
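Unlike the expected value, the variance scales with the square of the constant: Var(k * X) = k^2 * Var(X). That is why the second estimate is about 25 times the first rather than 5 times. A minimal sketch of the exact values, with illustrative names:

```r
# Exact variance of a binomial: size * p * (1 - p)
exact_var_X <- 20 * 0.1 * (1 - 0.1)

# Multiplying by 5 multiplies the variance by 5^2 = 25
exact_var_5X <- 5^2 * exact_var_X

c(exact_var_X, exact_var_5X)
```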
If X is drawn from a binomial with size 20 and p = .3, and Y from size 40 and p = .1, what is the expected value (mean) of X + Y?
Hint: Compute the expected value of X and the expected value of Y separately, then add them together.
In the last exercise, you found the expected value of the sum of two binomials. In this problem you’ll use a simulation to confirm your answer.
Simulate 100,000 draws from X, a binomial with size 20 and p = .3, and Y, with size 40 and p = .1.
Use this simulation to estimate the expected value of X + Y.
# Simulate 100,000 draws of X (size 20, p = .3) and Y (size 40, p = .1)
X <- rbinom(100000, 20, .3)
Y <- rbinom(100000, 40, .1)
# Estimate the expected value of X + Y
mean(X+Y)
FALSE [1] 9.99454
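For comparison, the exact expected value follows from adding the two individual expected values, as the hint suggests. The variable names below are illustrative.

```r
# Expected value of each binomial is size * p
exact_mean_X <- 20 * 0.3
exact_mean_Y <- 40 * 0.1

# The expected value of a sum is the sum of the expected values
exact_mean_sum <- exact_mean_X + exact_mean_Y
exact_mean_sum
```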
In the last multiple choice exercise, you examined the expected value of the sum of two binomials. Here you’ll estimate the variance.
Use your simulation of the variables X and Y to estimate the variance of X + Y.
Use your simulation to estimate the variance of 3 * X + Y.
# Simulation from last exercise of 100,000 draws from X and Y
X <- rbinom(100000, 20, .3)
Y <- rbinom(100000, 40, .1)
# Find the variance of X + Y
var(X+Y)
FALSE [1] 7.816989
# Find the variance of 3 * X + Y
var(3*X+Y)
FALSE [1] 41.64792
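These estimates can be checked against exact values. The sketch assumes X and Y are independent, as in the simulation: the variances of independent variables add, and a constant multiplier is squared. The variable names are illustrative.

```r
# Exact variance of each binomial: size * p * (1 - p)
exact_var_X <- 20 * 0.3 * (1 - 0.3)
exact_var_Y <- 40 * 0.1 * (1 - 0.1)

# Independent variables: Var(X + Y) = Var(X) + Var(Y)
exact_var_sum <- exact_var_X + exact_var_Y

# Var(3 * X + Y) = 3^2 * Var(X) + Var(Y)
exact_var_3X_plus_Y <- 3^2 * exact_var_X + exact_var_Y

c(exact_var_sum, exact_var_3X_plus_Y)
```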
Suppose you have a coin that is equally likely to be fair (50% heads) or biased (75% heads). You then flip the coin 20 times and see 11 heads.
Without doing any math, which do you now think is more likely: that the coin is fair, or that the coin is biased?
We see 11 out of 20 flips from a coin that is either fair (50% chance of heads) or biased (75% chance of heads). How likely is it that the coin is fair? Answer this by simulating 50,000 fair coins and 50,000 biased coins.
Simulate 50,000 cases of flipping 20 coins from a fair coin (50% chance of heads), as well as from a biased coin (75% chance of heads). Save these variables as fair and biased respectively.
Find the number of fair coins where exactly 11/20 came up heads, then the number of biased coins where exactly 11/20 came up heads. Save them as fair_11 and biased_11 respectively.
Find the fraction of all coins that came up heads 11 times that were fair coins. This is the posterior probability that a coin with 11/20 is fair.
# Simulate 50000 cases of flipping 20 coins from fair and from biased
fair <- rbinom(50000, 20, .5)
biased <- rbinom(50000, 20, .75)
# How many fair cases, and how many biased, led to exactly 11 heads?
fair_11 <- sum(fair==11)
biased_11 <- sum(biased==11)
# Find the fraction of coins with exactly 11 heads that were fair
fair_11/(fair_11+biased_11)
FALSE [1] 0.8601563
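The same fraction can be read as "filter, then average": pool all 100,000 simulated coins, keep only those that showed 11 heads, and ask what share of the survivors were fair. This is an equivalent restatement of the calculation above, not a different method; the seed and the names flips, is_fair, and posterior_fair are assumptions.

```r
# Arbitrary seed so the random draws are reproducible
set.seed(42)
fair <- rbinom(50000, 20, 0.5)
biased <- rbinom(50000, 20, 0.75)

# Pool the results and record which coins were fair
flips <- c(fair, biased)
is_fair <- rep(c(TRUE, FALSE), each = 50000)

# Among coins with exactly 11 heads, the share that were fair
posterior_fair <- mean(is_fair[flips == 11])
posterior_fair
```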
Suppose that when you flip a different coin (that could either be fair or biased) 20 times, you see 16 heads.
Without doing any math, which do you now think is more likely: that this coin is fair, or that it’s biased?
We see 16 out of 20 flips from a coin that is either fair (50% chance of heads) or biased (75% chance of heads). How likely is it that the coin is fair?
Simulate 50,000 cases of flipping 20 coins from a fair coin (50% chance of heads), as well as from a biased coin (75% chance of heads). Save these variables as fair and biased respectively.
Find the number of fair coins where exactly 16/20 came up heads, then the number of biased coins where exactly 16/20 came up heads. Save them as fair_16 and biased_16 respectively.
Print the fraction of all coins that came up heads 16 times that were fair coins. This is the posterior probability that a coin with 16/20 is fair.
# Simulate 50000 cases of flipping 20 coins from fair and from biased
fair <- rbinom(50000,20,.5)
biased <- rbinom(50000,20,.75)
# How many fair cases, and how many biased, led to exactly 16 heads?
fair_16 <- sum(fair==16)
biased_16 <- sum(biased==16)
# Find the fraction of coins with exactly 16 heads that were fair
fair_16/(fair_16+biased_16)
FALSE [1] 0.02384546
We see 14 out of 20 flips are heads, and start with an 80% chance the coin is fair and a 20% chance it is biased (75% chance of heads).
You’ll solve this case with simulation, by starting with a “bucket” of 10,000 coins, where 8,000 are fair and 2,000 are biased, and flipping each of them 20 times.
Simulate 8,000 trials of flipping a fair coin 20 times and 2,000 trials of flipping a biased coin 20 times. Save them as fair_flips and biased_flips, respectively.
Find the number of cases that resulted in 14 heads from each coin, saving them as fair_14 and biased_14 respectively.
Find the fraction of all coins that resulted in 14 heads that were fair: this is an estimate of the posterior probability that the coin is fair.
# Simulate 8000 cases of flipping a fair coin, and 2000 of a biased coin
fair_flips <- rbinom(8000, 20, .5)
biased_flips <- rbinom(2000, 20, .75)
# Find the number of cases from each coin that resulted in 14/20
fair_14 <- sum(fair_flips == 14)
biased_14 <- sum(biased_flips == 14)
# Use these to estimate the posterior probability
fair_14/(fair_14+biased_14)
FALSE [1] 0.4791667
Suppose instead of a coin being either fair or biased, there are three possibilities: that the coin is fair (50% heads), low (25% heads), or high (75% heads). There is an 80% chance it is fair, a 10% chance it is biased low, and a 10% chance it is biased high.
You see 14/20 flips are heads. What is the probability that the coin is fair?
Use the rbinom() function to simulate 80,000 draws from the fair coin, 10,000 draws from the high coin, and 10,000 draws from the low coin, with each draw containing 20 flips. Save them as flips_fair, flips_high, and flips_low, respectively.
For each of these types, compute the number of coins that resulted in 14. Save them as fair_14, high_14, and low_14, respectively.
Find the posterior probability that the coin was fair by dividing the number of fair coins resulting in 14 by the total number of coins resulting in 14.
# Simulate 80,000 draws from fair coin, 10,000 from each of high and low coins
flips_fair <- rbinom(80000, 20, .5)
flips_high <- rbinom(10000,20,.75)
flips_low <- rbinom(10000,20,.25)
# Compute the number of coins that resulted in 14 heads from each of these piles
fair_14 <- sum(flips_fair==14)
high_14 <- sum(flips_high==14)
low_14 <- sum(flips_low==14)
# Compute the posterior probability that the coin was fair
fair_14/(fair_14+high_14+low_14)
FALSE [1] 0.6359348
In this chapter, you used simulation to estimate the posterior probability that a coin that resulted in 11 heads out of 20 is fair. Now you’ll calculate it again, this time using the exact probabilities from dbinom(). There is a 50% chance the coin is fair and a 50% chance the coin is biased.
Use the dbinom() function to calculate the exact probability of getting 11 heads out of 20 flips with a fair coin (50% chance of heads) and with a biased coin (75% chance of heads). Save them as probability_fair and probability_biased, respectively.
Use these to calculate the posterior probability that the coin is fair. This is the probability that you would get 11 from a fair coin, divided by the sum of the two probabilities.
# Use dbinom to calculate the probability of 11/20 heads with fair or biased coin
probability_fair <- dbinom(11, 20, .5)
probability_biased <- dbinom(11, 20, .75)
# Calculate the posterior probability that the coin is fair
probability_fair/(probability_fair+probability_biased)
FALSE [1] 0.8554755
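Dividing by the plain sum of the two probabilities works here only because the two priors are equal: a 50% weight on each term cancels out of the fraction. The sketch below writes the priors out explicitly to show that the result is unchanged; prior_fair, prior_biased, with_priors, and without_priors are illustrative names.

```r
probability_fair <- dbinom(11, 20, 0.5)
probability_biased <- dbinom(11, 20, 0.75)

# Equal prior chance of each kind of coin
prior_fair <- 0.5
prior_biased <- 0.5

# Posterior with the priors written out explicitly
with_priors <- (prior_fair * probability_fair) /
  (prior_fair * probability_fair + prior_biased * probability_biased)

# Posterior using the shortcut from the exercise
without_priors <- probability_fair / (probability_fair + probability_biased)

all.equal(with_priors, without_priors)
```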
In the last exercise, you solved for the probability that the coin is fair if it results in 11 heads out of 20 flips, assuming that beforehand there was an equal chance of it being a fair coin or a biased coin. Recall that the code looked something like:
probability_fair <- dbinom(11, 20, .5)
probability_biased <- dbinom(11, 20, .75)
probability_fair / (probability_fair + probability_biased)
Now you’ll find, using the dbinom() approach, the posterior probability if there were two other outcomes.
Find the probability that a coin resulting in 14 heads out of 20 flips is fair.
Find the probability that a coin resulting in 18 heads out of 20 flips is fair.
# Find the probability that a coin resulting in 14/20 is fair
dbinom(14,20,.5)/(dbinom(14,20,.75)+dbinom(14,20,.5))
FALSE [1] 0.179811
# Find the probability that a coin resulting in 18/20 is fair
dbinom(18,20,.5)/(dbinom(18,20,.75)+dbinom(18,20,.5))
FALSE [1] 0.002699252
Suppose we see 16 heads out of 20 flips, which would normally be strong evidence that the coin is biased. However, suppose we had set a prior probability of a 99% chance that the coin is fair (50% chance of heads), and only a 1% chance that the coin is biased (75% chance of heads).
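You’ll solve this exercise by finding the exact answer with dbinom() and Bayes’ theorem. Recall that Bayes’ theorem, applied to this problem, looks like:
$$\Pr(\text{fair} \mid 16\ \text{heads}) = \frac{\Pr(16\ \text{heads} \mid \text{fair})\,\Pr(\text{fair})}{\Pr(16\ \text{heads} \mid \text{fair})\,\Pr(\text{fair}) + \Pr(16\ \text{heads} \mid \text{biased})\,\Pr(\text{biased})}$$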
Use dbinom() to calculate the probabilities that a fair coin and a biased coin would result in 16 heads out of 20 flips.
Use Bayes’ theorem to find the posterior probability that the coin is fair, given that there is a 99% prior probability that the coin is fair.
# Use dbinom to find the probability of 16/20 from a fair or biased coin
probability_16_fair <- dbinom(16, 20, .5)
probability_16_biased <- dbinom(16, 20, .75)
# Use Bayes' theorem to find the posterior probability that the coin is fair
(.99*probability_16_fair)/(.99*probability_16_fair+.01*probability_16_biased)
FALSE [1] 0.7068775
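The same calculation extends to any number of candidate coins: weight each coin's dbinom() probability by its prior, then divide by the total. The helper below, posterior_exact(), is a hypothetical function written for illustration and is not part of the exercises. It is applied to the 99%/1% case above and to the earlier three-coin case (80% fair, 10% low, 10% high, 14 heads), which was solved by simulation.

```r
# Hypothetical helper: exact posterior probability of each candidate coin
# heads, flips: the observed result; probs: each coin's chance of heads;
# priors: prior probability of each coin, in the same order as probs
posterior_exact <- function(heads, flips, probs, priors) {
  weighted <- priors * dbinom(heads, flips, probs)
  setNames(weighted / sum(weighted), names(probs))
}

# 16 heads out of 20, with a 99% prior that the coin is fair
two_coin <- posterior_exact(16, 20,
                            probs = c(fair = 0.5, biased = 0.75),
                            priors = c(0.99, 0.01))
two_coin

# 14 heads out of 20, with three candidate coins
three_coin <- posterior_exact(14, 20,
                              probs = c(fair = 0.5, low = 0.25, high = 0.75),
                              priors = c(0.8, 0.1, 0.1))
three_coin
```

The first element of each result is the posterior probability that the coin is fair, and each result sums to 1 across the candidate coins.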